Data-Driven Filter-Bank-based Feature Extraction for Speech Recognition
نویسنده
چکیده
Selecting good feature is especially important to achieve high speech recognition accuracy. Although the mel-cepstrum is a popular and effective feature for speech recognition, it is still unclear that the filter-bank in the mel-cepstrum is always optimal regardless of speech recognition environments or the characteristics of specific speech data. In this paper, we focus on the data-driven filter-bank optimization for a new feature extraction where we use the Kullback-Leibler (KL) distance as the measure in the filter-bank design. Experimental results showed that the proposed feature provides an error rate reduction of about 20% for clean speech as well as noisy speech compared to the conventional mel-cepstral feature.
منابع مشابه
Data Driven Design of Filter Bank for Speech Recognition
Filter bank approach is commonly used in feature extraction phase of speech recognition (e.g. Mel frequency cepstral coefficients). Filter bank is applied for modification of magnitude spectrum according to physiological and psychological findings. However, since mechanism of human auditory system is not fully understood, the optimal filter bank parameters are not known. This work presents a me...
متن کاملIntegrated Phoneme Subspace Method for Speech Feature Extraction
Speech feature extraction has been a key focus in robust speech recognition research. In this work, we discuss data-driven linear feature transformations applied to feature vectors in the logarithmic mel-frequency filter bank domain. Transformations are based on principal component analysis (PCA), independent component analysis (ICA), and linear discriminant analysis (LDA). Furthermore, this pa...
متن کاملImproving the performance of MFCC for Persian robust speech recognition
The Mel Frequency cepstral coefficients are the most widely used feature in speech recognition but they are very sensitive to noise. In this paper to achieve a satisfactorily performance in Automatic Speech Recognition (ASR) applications we introduce a noise robust new set of MFCC vector estimated through following steps. First, spectral mean normalization is a pre-processing which applies to t...
متن کاملReverse Correlation for Analyzing MLP Posterior Features in ASR
In this work, we investigate the reverse correlation technique for analyzing posterior feature extraction using an multilayered perceptron trained on multi-resolution RASTA (MRASTA) features. The filter bank in MRASTA feature extraction is motivated by human auditory modeling. The MLP is trained based on an error criterion and is purely data driven. In this work, we analyze the functionality of...
متن کاملFeature Extraction with Combination of HMT-Based Denoising and Weighted Filter Bank Analysis for Robust Speech Recognition
In this paper, we propose a new feature extraction method that combines both HMT-based denoising and weighted filter bank analysis for robust speech recognition. The proposed method is made up of two stages in cascade. The first stage is denoising process based on the wavelet domain Hidden Markov Tree model, and the second one is the filter bank analysis with weighting coefficients obtained fro...
متن کامل